26 research outputs found

    Dominant speaker detection in multipoint video communication using Markov chain with non-linear weights and dynamic transition window

    This paper proposes an enhanced discrete-time Markov chain algorithm for predicting the dominant speaker(s) in a multipoint video communication system in the presence of transient speech. The proposed algorithm exploits statistical properties of past speech patterns to accurately predict the dominant speaker for the next time state. Non-linear weight-based coefficients are applied in the enhanced Markov chain to both the initial state vector and the transition probability matrix. These weights significantly reduce the time taken to predict a new dominant speaker during a conference session. In addition, a mechanism that dynamically modifies the size of the transition probability matrix window (container) is introduced to improve the adaptability of the Markov chain to the variability of speech characteristics. Simulation results indicate that, for a test scenario with 11 conference participants, the enhanced Markov chain prediction algorithm achieved 85% accuracy in predicting a dominant speaker when compared with an ideal case in which there is no transient speech. Misclassification of dominant speakers due to transient speech was also reduced by 87%.
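
    The idea of weighting a Markov chain towards recent speech can be sketched as below. This is only an illustration, not the paper's algorithm: the exponential `decay` weighting, the fixed `window` size and the function name `predict_next_speaker` are all assumptions, and the dynamic resizing of the window described in the abstract is not modelled.

```python
from collections import defaultdict


def predict_next_speaker(history, window=20, decay=0.8):
    """Predict the next dominant speaker from a first-order Markov chain.

    history: list of speaker IDs, oldest first.
    window:  number of most recent states to use (fixed here; an assumption).
    decay:   hypothetical non-linear weight; a transition that is k steps
             old contributes decay ** k to the transition matrix.
    """
    recent = history[-window:]
    if not recent:
        raise ValueError("history must not be empty")

    # weights[a][b] accumulates the weighted count of transitions a -> b.
    weights = defaultdict(lambda: defaultdict(float))
    n_transitions = len(recent) - 1
    for i in range(n_transitions):
        age = n_transitions - 1 - i  # 0 for the newest transition
        weights[recent[i]][recent[i + 1]] += decay ** age

    current = recent[-1]
    row = weights.get(current)
    if not row:
        # No outgoing transition observed yet: assume the speaker continues.
        return current

    # Normalise the row into probabilities and pick the most likely state.
    total = sum(row.values())
    probabilities = {speaker: w / total for speaker, w in row.items()}
    return max(probabilities, key=probabilities.get)
```

    With a history such as `["A", "A", "A", "B", "A", "A", "A"]`, the single short interjection by `"B"` does not outweigh the `A -> A` transitions, so the sketch keeps `"A"` as the dominant speaker.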

    Cross-domain Transfer Learning and State Inference for Soft Robots via a Semi-supervised Sequential Variational Bayes Framework

    Recently, data-driven models such as deep neural networks have been shown to be promising tools for modelling and state inference in soft robots. However, voluminous amounts of data are necessary for deep models to perform effectively, which requires exhaustive and high-quality data collection, particularly of state labels. In practice, obtaining labelled state data for soft robotic systems is challenging for various reasons, including the difficulty of sensorizing soft robots and the inconvenience of collecting data in unstructured environments. To address this challenge, in this paper, we propose a semi-supervised sequential variational Bayes (DSVB) framework for transfer learning and state inference in soft robots with missing state labels on certain robot configurations. Considering that soft robots may exhibit distinct dynamics under different robot configurations, a feature space transfer strategy is also incorporated to promote the adaptation of latent features across multiple configurations. Unlike existing transfer learning approaches, our proposed DSVB employs a recurrent neural network to model the nonlinear dynamics and temporal coherence in soft robot data. The proposed framework is validated on multiple setup configurations of a pneumatic-based soft robot finger. Experimental results on four transfer scenarios demonstrate that DSVB performs effective transfer learning and accurate state inference amidst missing state labels. The data and code are available at https://github.com/shageenderan/DSVB. Comment: Accepted at the International Conference on Robotics and Automation (ICRA) 202

    Attestation of Improved SimBlock Node Churn Simulation

    Node churn, or the constant joining and leaving of nodes in a network, can impact the performance of a blockchain network. The difficulties of performing research on an actual blockchain network, particularly on a live decentralized global network like Bitcoin, pose challenges that good simulators can overcome. While various tools, such as NS-3 and OMNeT++, are useful for simulating network behavior, SimBlock is specifically designed to simulate the complex Bitcoin blockchain network. However, the current implementation of SimBlock has limitations when replicating actual node churn activity. In this study, the SimBlock simulator was improved to simulate node churn more accurately by removing churning nodes, dropping their connections and adding instrumentation for validation. The methodology involved modeling Bitcoin node churn behavior based on previous studies and using the enhanced SimBlock simulator to simulate node churn. Empirical studies were then conducted to determine the suitability and limitations of the node churn simulation. This study found that the improved SimBlock could produce results similar to observed indicators in a 100-node network. However, it still had limitations in replicating node churn behavior accurately. It was discovered that SimBlock limits all nodes to operate as mining nodes and that mining is simulated in a way that does not depict churn accurately at any time but only at specific intervals or under certain conditions. Despite these limitations, the study's improvements to SimBlock and the identification of its limitations can be useful for future research on node churn in blockchain networks and the development of more effective simulation tools.
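
    The core operation described above, removing a churning node and dropping its connections, can be illustrated with a minimal sketch. This is not SimBlock code (SimBlock is a Java simulator); the adjacency-dictionary representation and the function name `churn_out` are assumptions made for illustration only.

```python
def churn_out(adjacency, node):
    """Return a new adjacency map with `node` removed from the network.

    adjacency: dict mapping each node ID to the set of its peer IDs
               (a hypothetical representation, not SimBlock's own).
    The departing node disappears as a key, and every remaining node
    drops its connection to it.
    """
    return {
        other: peers - {node}
        for other, peers in adjacency.items()
        if other != node
    }
```

    For a fully connected three-node network `{1: {2, 3}, 2: {1, 3}, 3: {1, 2}}`, churning out node `2` leaves nodes `1` and `3` connected only to each other.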

    Serverless parallel video combiner with dominant speaker detection for ultra–high definition multipoint video communication systems

    The unprecedented shift towards user ubiquity in the 21st century, coupled with rapid advancements in computing and network infrastructures, has significantly increased the adoption of multipoint video communication among consumers, governments and corporations globally. This rapid adoption has brought about new technical challenges. The first technical challenge concerns the inefficiencies of the conventional centralised video combiner architecture in terms of its scalability, computational efficiency and image quality. The second technical challenge pertains to the latency of stitching high-resolution video frames for ultra-high definition (UHD) display systems in real time. The third technical challenge concerns the variability of speech characteristics amongst conference participants. This variability gives rise to transient speech patterns that result in misclassification of a dominant speaker. This thesis proposes original solutions to the aforementioned challenges to further enhance the performance of a multipoint video communication system.

    Active surveillance using depth sensing technology — Part II: Extending tracking range via cascaded depth sensors

    In part II of a three-part series on active surveillance using depth-sensing technology, this paper proposes a method that autonomously transfers tracking of a premise intruder from one Kinect sensor to an adjacent one. The proposed method addresses the inherently limited depth range of a Kinect sensor when tracking an intruder further away. This is achieved by continuously monitoring the rotation angle of a Kinect sensor that is mounted on an independent pan-tilt unit (PTU). As the rotation angle approaches a crossover region, the current PTU signals the adjacent PTU to rotate towards this region, at which point the adjacent unit identifies and continues to track the intruder. Experimental results validate the feasibility of the proposed method in tracking an intruder at longer distances through cascaded Kinect sensors.

    Active surveillance using depth sensing technology — Part III: Real-time intrusion mapping with remote notification

    In the final part of the three-part series on active surveillance using depth-sensing technology, this paper proposes a system that provides both real-time geographical tracking of an intruder and remote alarm notification. This is achieved by first translating both the skeletal depth and rotation angle from a set of cascaded Kinect depth sensors mounted on a pan-tilt unit into a geographical coordinate system. These coordinates are then relayed to multiple notification modules, representing a unified remote alarm notification system for one or more surveilled premises. The system also includes a real-time plot of the intruder on a map during the tracking phase and a proximity algorithm to compute the distance between the intruder and each premise. Experimental results validate the feasibility of the proposed system in realizing a unified real-time intruder mapping and notification platform.
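
    The translation from depth and rotation angle to map coordinates, followed by the proximity computation, can be sketched as below. This is a simplified illustration under stated assumptions: a flat local frame in metres stands in for the geographical coordinate system, the pan angle is measured counter-clockwise from the x-axis, and the names `intruder_position` and `distances_to_premises` are hypothetical rather than taken from the paper.

```python
import math


def intruder_position(sensor_xy, pan_deg, depth_m):
    """Project a depth reading along the sensor's pan angle.

    sensor_xy: (x, y) of the sensor in a local planar frame, in metres.
    pan_deg:   rotation angle of the pan-tilt unit, in degrees.
    depth_m:   skeletal depth reported by the sensor, in metres.
    """
    theta = math.radians(pan_deg)
    return (
        sensor_xy[0] + depth_m * math.cos(theta),
        sensor_xy[1] + depth_m * math.sin(theta),
    )


def distances_to_premises(intruder_xy, premises):
    """Euclidean distance from the intruder to each named premise."""
    return {
        name: math.hypot(intruder_xy[0] - x, intruder_xy[1] - y)
        for name, (x, y) in premises.items()
    }
```

    A deployed system would additionally need to convert the local frame into latitude and longitude, which this sketch leaves out.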

    Active surveillance using depth sensing technology — Part I: Intrusion detection

    In part I of a three-part series on active surveillance using depth-sensing technology, this paper proposes an algorithm that identifies outdoor intrusion activities by monitoring skeletal positions from a Microsoft Kinect sensor in real time. The algorithm implements three techniques to identify a premise intrusion. The first technique observes a boundary line along the wall (or fence) of a surveilled premise for skeletal trespassing detection. The second technique observes the duration of a skeletal object within a region of a surveilled premise for loitering detection. The third technique analyzes the differences in skeletal height to identify wall climbing. Experimental results suggest that the proposed algorithm is able to detect trespassing, loitering and wall climbing at rates of 70%, 85% and 80%, respectively.
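
    The three techniques can be illustrated with a minimal sketch over a sequence of skeletal samples. The data layout, the threshold values (`loiter_seconds`, `climb_rise_m`) and the names `SkeletonSample` and `detect_events` are assumptions for illustration; the abstract does not give the paper's actual rules or parameters.

```python
from dataclasses import dataclass


@dataclass
class SkeletonSample:
    t: float         # timestamp in seconds
    x: float         # position across the boundary line, in metres
    head_y: float    # head height, in metres
    in_region: bool  # True if the skeleton is inside the watched region


def detect_events(samples, boundary_x=0.0, loiter_seconds=10.0,
                  climb_rise_m=0.5):
    """Return the set of intrusion events seen in `samples`.

    All thresholds are hypothetical placeholders.
    """
    events = set()

    for prev, cur in zip(samples, samples[1:]):
        # Trespassing: the skeleton moved from one side of the line to the other.
        if (prev.x - boundary_x) * (cur.x - boundary_x) < 0:
            events.add("trespassing")
        # Wall climbing: the head height rose sharply between samples.
        if cur.head_y - prev.head_y >= climb_rise_m:
            events.add("wall_climbing")

    # Loitering: continuous presence in the region beyond the time limit.
    entered_at = None
    for sample in samples:
        if sample.in_region:
            if entered_at is None:
                entered_at = sample.t
            if sample.t - entered_at >= loiter_seconds:
                events.add("loitering")
        else:
            entered_at = None

    return events
```

    Keeping the three checks independent mirrors the abstract's description of three separate techniques, so each can be tuned or disabled on its own.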

    Active participant identification and tracking using depth sensing technology for video conferencing

    Video conferencing is an effective method of point-to-point or multipoint real-time communication between two or more participants. However, repeated manual adjustment of the video capture device to focus on an active participant is a challenge, especially if the conference participant moves out of the video capture window. As such, this paper proposes an active participant identification and tracking system, which continuously tracks the active conference participant and automatically adjusts the video capture device to keep that participant in focus. The proposed system first applies a Haar cascade face detection algorithm to register and store a set of facial images of the active participant. By leveraging the depth sensing technology of Microsoft Kinect, the system captures images at the skeletal head positions of participants within the Kinect camera viewpoint and compares them against the aforementioned stored face detection images using the principal component analysis face recognition algorithm. The user recognized by the system is then continuously tracked as a skeletal object via a custom-designed vertical and horizontal servo-controlled motorized system. The custom motorized system sits under the Kinect sensor and is able to achieve 180 degrees of horizontal panning and 22.7 degrees of vertical tilting in line with the movement of the active conference participant.

    Software-based serverless endpoint video combiner architecture for high-definition multiparty video conferencing

    This paper proposes an endpoint video combiner architecture in a multipoint control unit (MCU) system for high definition multiparty video conferencing. The proposed architecture addresses the current reliability, computational and quality drawbacks of a conventional centralized video combiner architecture. This is achieved by redesigning the MCU video to move the video combiner away from the bridge and into the client endpoints. Moreover, the proposed architecture represents a serverless system and is able to scale to a large number of clients at high resolutions in a multipoint video conferencing session. In order to realize this design, this paper also proposes a custom, robust and sustainable session management protocol that allows dynamic multi-port management between the MCU video and client endpoints. In addition, the proposed custom session management protocol includes a recommendation for a session protection structure. Experimental results suggest that the proposed architecture exhibits significant computational frame rate performance gains of up to 762.95% in comparison with the conventional centralized video combiner architecture, based on a series of four and eight high definition combined video assessments. Moreover, reliability analysis suggests that the proposed architecture is also able to consistently sustain a high frame rate within a long-duration high definition multipoint video conferencing session.
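
    As a reading aid, and assuming the reported gain is a relative increase in frame rate over the centralized baseline (the abstract does not define it), the figure can be interpreted as follows. The symbols $f_c$ and $f_e$ are hypothetical names for the centralized and endpoint frame rates.

```latex
\[
G = \frac{f_e - f_c}{f_c} \times 100\%,
\qquad
G = 762.95\% \;\Longrightarrow\; \frac{f_e}{f_c} = 1 + 7.6295 = 8.6295
\]
```

    Under that assumption, the best case corresponds to roughly 8.6 times the baseline frame rate.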